The Political Districting Problem is mapped to a $q$-state Potts model in which the constraints 
can be written as interactions between sites or external fields acting on the system. Districting into 
$q$ voter districts is equivalent to finding the ground state of this $q$-state Potts model. We illustrate 
this by districting Taipei city in its 2008 Legislature Election. Statistical properties of the model 
are also studied. 

PACS numbers: 89.65.Cd; 89.65.Ef; 89.65.Gh; 89.90.+n 

I. INTRODUCTION 

In 1812, Massachusetts governor Elbridge Gerry got help from his political party by crafting a district and won 
his own election. At the time, someone produced an illustration of the districting and emphasized its similarity 
to a salamander. The term Gerrymander was then coined by combining "Gerry" and "mander". Nowadays, 
Gerrymandering refers to the practice of drawing district lines to maximize the advantage of a political party. For 
example, a bipartisan gerrymander is one in which the districting protects incumbents, and a racial or 
ethnic gerrymander is one that dilutes or preserves the strength of minorities. 

Political Districting has since become an issue that is always political, controversial and sometimes even ugly. In 
the US, for example, the results of the population census taken every ten years may require a voter redistricting in 
order to redistribute the House seats among the states. Politicians have always fought over district boundaries, while 
the courts might consider the problem just too political even to enter. Several constitutional amendments were actually 
passed in the nineteenth century to prevent Gerrymandering. In nearly half of the states that underwent voter 
redistricting in the 1990s, federal or state courts played an essential role in the redistricting debate and judges 
actually issued new lines in ten states. In this process of redistricting, the courts never used quantitative methods 
to justify the plans. One may wonder whether there are more objective methods to perform the redistricting process. 

Mathematical and numerical approaches do exist in the literature. Such methods can in principle 
eliminate Gerrymandering by providing well defined steps and constraints. Local search methods include those used 
by Kaiser [1] and by Nagel [2]. An implicit enumeration technique was also developed by Garfinkel and Nemhauser 
[3]. George et al. [4] studied the problem of determining New Zealand's electoral districts, using a location-allocation 
based iterative method in conjunction with a geographic information system (GIS). 

From a mathematical point of view, the Political Districting Problem belongs to what is known as the Districting (or 
zone design) Problem, in which $n$ units are grouped into $k$ zones such that some cost function is optimized, subject 
to constraints on the topology of the zones, etc. It has been shown to be NP-complete [5] and is therefore best treated 
by optimization methods. The Districting Problem is a geographical problem which is present in a number of 
geographical tasks such as school districting, design of sales territories, etc. The constraints of the Districting Problem 
are very similar to those of the Clustering Problem in optimization. Let the set of $n$ initial units be 
$X = \{x_1, x_2, \ldots, x_n\}$, where $x_i$ is the $i$-th unit, and let the number of districts be $k$. Let $Z_i$ be 
the set of all the units that belong to district $i$. Then 

$$ Z_i \neq \emptyset , \quad i = 1, \ldots, k , $$

$$ Z_i \cap Z_j = \emptyset , \quad i \neq j , $$

$$ \bigcup_{i=1}^{k} Z_i = X . \tag{1.1} $$



There is an additional constraint in the Districting Problem, namely the constraint of contiguity, which makes the 
problem somewhat more complicated. It restricts the set of possible solutions to those that assure contiguity 
between the units within each designed district. Contiguity here means that every unit in a district is connected to 
every other unit through units that are also in the district. An important optimization criterion in the Political 
Districting Problem is to avoid Gerrymandering. It is generally accepted that there are three essential characteristics 
that the districts should have [6]: population equality, contiguity and geographical compactness. The task here is 
therefore to devise a method that is able to produce solutions which satisfy these characteristics. 
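
The contiguity condition defined above can be checked with a breadth-first search restricted to the units of one district. The following is a minimal Python sketch; the function name `is_contiguous` and the dictionary-of-neighbors representation are assumptions made for illustration, not part of the method described in this paper.

```python
from collections import deque

def is_contiguous(units, neighbors):
    """Return True if every unit in `units` can reach every other one
    by stepping only through units that are also in `units`.
    An empty set is treated as trivially contiguous (a convention)."""
    units = set(units)
    if not units:
        return True
    start = next(iter(units))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in neighbors.get(u, ()):
            # Only walk through units that belong to the same district.
            if v in units and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == units

# Hypothetical chain of five units: 0 - 1 - 2 - 3 - 4.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
```

In this toy chain, the district {0, 1, 2} is contiguous, while {0, 2} is not, because unit 1 separates the two units.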

In this paper, we map the Districting Problem onto a $q$-state Potts model. This allows us to study the problem 
by using statistical physics methods. Most of the constraints that we mentioned above can be represented as 
interaction terms among various sites of the $q$-state Potts model or as an external field added to the system. 
By doing so, we can also understand the corresponding physical nature of such a social problem. 

Using a physics model to study a social science problem is not new. Many papers and books have already been 
written on various subjects in social science. People have employed concepts such as scaling to study 
the social behavior in financial markets [7-9]. Statistical models have also been applied to NP-complete problems in 
combinatorial optimization [10]. Here we demonstrate how a social economics problem can be transformed into a 
physics model and carry out an optimization study to look for the optimal solution of the problem. This paper is 
organized as follows. Section II describes the model for the redistricting problem. Section III contains the 
results of our numerical simulation, and Section IV is the summary and discussion. 



II. THE MODEL 



The $q$-state Potts model was first proposed as an appropriate generalization of the Ising model, to consider a system 
of spins confined in a plane with each spin pointing to one of the $q$ equally spaced directions specified by the angles 

$$ \Theta_n = \frac{2\pi n}{q} , \quad n = 0, 1, \ldots, q-1 . \tag{2.1} $$

In the most general form, the nearest-neighbor interaction would only depend on the relative angle between the two 
vectors. The Hamiltonian will then take the form 

$$ H = -\sum_{\langle ij \rangle} J(\Theta_{ij}) , \tag{2.2} $$

where the function $J(\Theta)$ is $2\pi$ periodic and $\Theta_{ij} = \Theta_{n_i} - \Theta_{n_j}$ is the angle between 
the two spins of neighboring sites $i$ and $j$. 

In his seminal work, Potts [11] chose 



$$ J(\Theta) = \epsilon_1 \cos\Theta , \tag{2.3} $$

and was able to determine the critical point of this (now known as planar Potts) model on the square lattice for 
$q = 2, 3, 4$. As a remark, he also gave the critical point for all $q$ of the following (now known as standard Potts) 
model 

$$ J(\Theta_{ij}) = \epsilon_2 \, \delta_{S_i, S_j} , \tag{2.4} $$

where $\delta_{a,b}$ is the Kronecker delta, equal to 1 when $a = b$ and 0 otherwise, and $S_i$ is the state of the 
spin at site $i$. It is the model with interaction energy of the form in Eq. (2.4) that has attracted the most 
attention to date. In our study here, we shall use this standard Potts model as the starting point. 

To begin with, we let each site, or $q$-state spin, represent a unit in the Districting Problem. There are $N$ units 
in total, and $q$ is the total number of districts in the plan. For a spin to be in one of the $q$ states means that 
the unit belongs to that particular district. Define $n_j$ to be the number of sites in a particular state $j$. It is 
clear that 

$$ \sum_{j=1}^{q} n_j = N . \tag{2.5} $$

The objective here is to include the constraints such as population equality, contiguity and geographical compactness 
as interaction energy terms in the Hamiltonian so that the ground state energy configuration corresponds to the 
optimal solution of the problem. 

Achieving equal voter population size districts is central to any Political Districting Problem. A measure of this can 
be obtained, e.g., by calculating the sum of the differences between the population of each district and the average 
population over all districts. To include this in our model, we can view it as an external random field acting on site 
$i$ with a field strength $p_i$ representing the voter population of this site. Therefore, the total voter population 
$P_j$ of a district $j$ is the sum of the interaction of the external random field and all the sites within this 
district and is given by 

$$ P_j = \sum_{i=1}^{N} p_i \, \delta_{S_i, j} , \tag{2.6} $$

and the total voter population of the plan is given by 

$$ P_0 = \sum_{i=1}^{N} p_i . \tag{2.7} $$

The average voter population $\langle P \rangle$ for each district is therefore equal to $P_0/q$. The difference 
between the voter population of district $j$ and the average voter population will contribute to the Hamiltonian of 
the system. Let us define its total contribution to be $H_P$. It is then given by 



(2.8) 
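
The district populations of Eq. (2.6) and a deviation measure of the kind described above can be sketched in Python. The normalized sum of absolute deviations used in `population_energy` is an assumed form chosen for illustration, following the "sum of the differences" measure mentioned in the text; the function names and the toy data are hypothetical.

```python
def district_populations(spins, pop, q):
    """P_j = sum_i p_i * delta(S_i, j), as in Eq. (2.6).
    spins[i] is the district (0 .. q-1) of site i, pop[i] its voter population."""
    totals = [0.0] * q
    for s, p in zip(spins, pop):
        totals[s] += p
    return totals

def population_energy(spins, pop, q):
    """Assumed population term: sum_j |P_j - <P>| / <P>, with <P> = P_0 / q."""
    totals = district_populations(spins, pop, q)
    mean = sum(pop) / q
    return sum(abs(P - mean) for P in totals) / mean

# Hypothetical example: four sites split into q = 2 districts.
pop = [100, 300, 200, 200]
balanced = [0, 0, 1, 1]      # both districts hold 400 voters
unbalanced = [0, 1, 1, 1]    # districts hold 100 and 700 voters
```

With these toy numbers the balanced plan has zero population energy, while the unbalanced plan is penalized in proportion to its deviation from the average of 400 voters per district.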

Population equality alone will sometimes lead to problems of contiguity and compactness in districting, resulting in 
districts of unnatural shapes. Hence compactness is usually an important factor in any political districting solution. 
There are many ways to define the compactness of a district but there is yet no universally acceptable definition of 
compactness. Young [12] studied eight different measures of compactness and showed that each measure fails to give 
satisfactory results on certain geographical configurations. In short, any good measure of compactness must apply 
both to the districting as a whole and to each district individually. It should also be conceptually simple and should 
use easily collected and verifiable data. Our strategy here is to take compactness as the smallest total sum of all 
boundaries between different districts. In this way, we can view it as the interaction energy between domains, or the 
domain wall energy. Thus, the contribution between sites i and j to the domain wall energy is 

$$ (1 - \delta_{S_i, S_j}) \, C_{ij} , \tag{2.9} $$

where $C_{ij} = 1$ if $i$ and $j$ are neighboring sites and zero otherwise. Defining the energy from this interaction 
to be $H_D$, we have 

$$ H_D = \sum_{i,j} (1 - \delta_{S_i, S_j}) \, C_{ij} . \tag{2.10} $$

$H_P$ and $H_D$ are the two energy terms that we include in our Hamiltonian for the study of the system, which we 
now define as 

$$ H = \lambda_P H_P + \lambda_D H_D , \tag{2.11} $$

where $\lambda_P$ and $\lambda_D$ are constant coefficients. Notice that the way we define $H_D$ would in most cases 
guarantee the constraint of contiguity, depending on the ratio between $\lambda_P$ and $\lambda_D$. Other constraints 
can also be included as interactions among various sites or external fields acting on the system and will be discussed 
in Sec. IV. 
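
As an illustration of Eqs. (2.10) and (2.11), the domain wall term can be evaluated by counting the neighboring pairs that sit in different districts. The sketch below is in Python; it assumes each neighboring pair is listed once in `edges` (summing over ordered pairs would simply double the count), and the function names and the default coefficients are illustrative choices.

```python
def domain_wall_energy(spins, edges):
    """H_D: number of neighboring pairs (i, j) assigned to different districts.
    `edges` is assumed to list each neighboring pair exactly once."""
    return sum(1 for i, j in edges if spins[i] != spins[j])

def total_energy(h_p, h_d, lam_p=50.0, lam_d=1.0):
    """H = lam_p * H_P + lam_d * H_D, as in Eq. (2.11)."""
    return lam_p * h_p + lam_d * h_d

# Hypothetical 2 x 2 grid of sites:
#   0 - 1
#   |   |
#   2 - 3
edges = [(0, 1), (2, 3), (0, 2), (1, 3)]
compact = [0, 0, 1, 1]     # top row vs. bottom row: 2 boundary edges
scattered = [0, 1, 1, 0]   # diagonal assignment: every edge is a boundary
```

The compact plan cuts two edges and the scattered plan cuts all four, so a larger weight on $H_D$ favors the compact plan.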

III. NUMERICAL SIMULATION RESULTS 

In the above section, we have shown how to map the Districting Problem to a $q$-state Potts model and rewrite the 
constraints as interactions between different sites and external fields acting on the system. In this section, we will 
use the problem of determining the districting for the Taiwan Legislature seats as an example, though the method is 
equally applicable to any districting problem. We use the Monte Carlo method to perform our simulation. 

FIG. 1: The distribution of the number of precincts vs. precinct voter population in Taipei city. 

FIG. 2: (a) A district of 10 precincts in our model; (b) each red dot represents one precinct; (c) a network of precincts connected 
by green arcs; (d) the network extracted from (c). 

Starting from the 2008 Legislature election, the political districts of Taiwan will be restructured, resulting in 73 
voter districts where each district will elect its own legislator for the Legislature. Taipei city, for example, will have 8 
voter districts. On the other hand, Taipei city has at this moment a total of 449 precincts (a precinct corresponds to 
a site in our model) and a voter population of about 2 million. Redistricting into 8 voter districts implies that we 
need to regroup the precincts into different voter districts, which is a typical Political Districting Problem. 
In order to satisfy the population equality constraint, a voter population of about 250,000 voters in each voter district 
is preferred. 

Figure 1 shows the distribution of the number of precincts vs. precinct voter population in Taipei city. The y-axis 
is the number of precincts while the x-axis is the voter population in each precinct, with each bin representing 500 
voters. It is interesting to see that this can be approximated by a Gaussian distribution with a mean of about 4,100 
and a variance of about 1,450. This can in turn be interpreted as a Gaussian-distributed random field term in the 
$q$-state Potts model. We should mention here that the $q$-state Potts model with Gaussian-distributed random fields is of 
tremendous interest in physics. It has been studied for a long time and will not be discussed here. 

In order to map this problem onto the $q$-state Potts model, we need to define what is meant by the neighbors of 
a site (precinct). Figure 2 is an illustration of our definition. It represents a district with 10 precincts, as shown in 
Figure 2(a). Each of the precincts is represented by a red dot, which is shown in Figure 2(b). The red dots correspond 
to the sites in our $q$-state Potts model. These red dots are then connected by green arcs, forming a network as shown 
in Figure 2(c). Figure 2(d) is the network extracted from Figure 2(c). We can further exploit the interesting properties 
of this example. Figure 3 is the distribution of the number of neighbors that each of the current 449 precincts in Taipei 
city has. We can call this the connectivity of a precinct. We can also approximate this by a Gaussian distribution 
with a mean of about 4.6 and a standard deviation of about 1. This further supports the introduction of an interaction 
term among different sites in our discussion above. The preferred voter district boundaries here will cut through those 
precincts with fewer connections to avoid long and thin precincts (units) within a voter district in order to satisfy 
compactness. In fact, precincts with more connections will lie closer to the center of a voter district while precincts 
with few connections will stay near or at the district boundaries. 

FIG. 3: The distribution of the number of neighbors that each of the current 449 precincts in Taipei city has. 

FIG. 4: Energy of the system vs. temperature. 
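
The connectivity of each precinct, as defined above, is simply the degree of the corresponding node in the neighbor network. A minimal Python sketch of how such a distribution could be tabulated is given below; the edge list is a made-up four-site example and not the Taipei data.

```python
from collections import Counter
from statistics import mean, pstdev

def connectivity(edges, n_sites):
    """Number of neighbors of each site, given each neighboring pair once."""
    degree = [0] * n_sites
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    return degree

# Hypothetical network of four sites.
edges = [(0, 1), (1, 2), (2, 3), (1, 3)]
degree = connectivity(edges, 4)
histogram = Counter(degree)      # how many sites have each connectivity
avg, spread = mean(degree), pstdev(degree)
```

Applied to the real precinct network, `histogram` would correspond to the distribution plotted in Figure 3, and `avg` and `spread` to the mean and standard deviation of its Gaussian approximation.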

The competition between the two terms in the Hamiltonian in Eq. (2.11) defines the statistical properties of the 
model. Figure 4 shows the energy of the system vs. temperature. As we can see, the system undergoes a phase 
transition around $T = 2.0$ with an energy $E \approx 308$, when $\lambda_P$ and $\lambda_D$ are set to 50 and 1 
respectively. The phase transition here corresponds to the formation of domains, or the aggregation of precincts into 
compact voter districts. As mentioned above, the total number of sites in this system is 449 while $q$ is equal to 8, 
the number of precincts and voter districts in Taipei respectively. To make sure there is no peculiar behavior in our 
system, we have actually started from $T = 10^7$ and gradually lowered the temperature and obtained a smooth curve all 
the way down to $T = 0.1$, where there is no further change in the ground state energy. 

FIG. 5: Specific heat capacity $C_V$ of our system vs. temperature. 
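
The cooling procedure described above can be illustrated with a small Metropolis-type annealing loop in Python. This is a sketch under stated assumptions and not the simulation code used for the results here: the population term is an assumed normalized sum of absolute deviations, the geometric cooling schedule, the starting temperature and the single-site update are illustrative choices, and all names are hypothetical.

```python
import math
import random

def energy(spins, pop, edges, q, lam_p=50.0, lam_d=1.0):
    """H = lam_p * H_P + lam_d * H_D with an assumed form for H_P."""
    totals = [0.0] * q
    for s, p in zip(spins, pop):
        totals[s] += p
    mean = sum(pop) / q
    h_p = sum(abs(P - mean) for P in totals) / mean       # assumed population term
    h_d = sum(1 for i, j in edges if spins[i] != spins[j])  # domain wall term
    return lam_p * h_p + lam_d * h_d

def anneal(pop, edges, q, lam_p=50.0, lam_d=1.0,
           t_start=10.0, t_end=0.1, cooling=0.95, sweeps=20, seed=0):
    """Lower the temperature geometrically, proposing single-site district changes."""
    rng = random.Random(seed)
    n = len(pop)
    spins = [rng.randrange(q) for _ in range(n)]
    e = energy(spins, pop, edges, q, lam_p, lam_d)
    best, best_e = spins[:], e
    t = t_start
    while t > t_end:
        for _ in range(sweeps * n):
            i = rng.randrange(n)
            old = spins[i]
            spins[i] = rng.randrange(q)
            e_new = energy(spins, pop, edges, q, lam_p, lam_d)
            d = e_new - e
            # Metropolis rule: always accept downhill moves, uphill with exp(-d / t).
            if d <= 0 or rng.random() < math.exp(-d / t):
                e = e_new
                if e < best_e:
                    best, best_e = spins[:], e
            else:
                spins[i] = old
        t *= cooling
    return best, best_e

# Hypothetical 4 x 4 grid of equally populated units, split into q = 2 districts.
L = 4
pop = [1.0] * (L * L)
edges = [(x + L * y, x + 1 + L * y) for y in range(L) for x in range(L - 1)]
edges += [(x + L * y, x + L * (y + 1)) for y in range(L - 1) for x in range(L)]
best, best_e = anneal(pop, edges, q=2)
```

Recomputing the full energy at every proposal keeps the sketch short; a practical implementation would update only the terms affected by the changed site.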

Figure 5 is a plot of the specific heat capacity $C_V$ vs. temperature. One can see that there is a peak around 
temperature $T = 2.0$, with a peak value of about 273, which is an indication of a phase transition in which 
large domains are formed and condensed. The temperature at which the peak appears corresponds to the critical 
temperature. 
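
One common way to estimate a specific heat from Monte Carlo data is the energy fluctuation formula $C_V = (\langle E^2 \rangle - \langle E \rangle^2)/T^2$ with $k_B = 1$. The Python sketch below assumes this estimator; the text does not state which estimator or normalization was used for Figure 5, so it should be read as an illustration only.

```python
def specific_heat(energies, temperature):
    """Fluctuation estimate C_V = (<E^2> - <E>^2) / T^2 with k_B = 1.
    `energies` are energy samples collected at a fixed temperature."""
    n = len(energies)
    mean = sum(energies) / n
    variance = sum((e - mean) ** 2 for e in energies) / n
    return variance / temperature ** 2

# Hypothetical energy samples at two temperatures.
c_low = specific_heat([1.0, 3.0], 1.0)
c_high = specific_heat([1.0, 3.0], 2.0)
```

A peak in this quantity as a function of temperature signals large energy fluctuations, which is what one expects near the transition.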

Since our example here is a finite system (with only 449 sites), finite-size effects prevent the peak from diverging. In 
order to see that this model has a phase transition, we construct an artificial system which has interaction terms 
similar to those of our example but with a varying number of sites, to demonstrate the critical properties in the 
thermodynamic limit. The artificial model is our $q$-state Potts model on a periodic two-dimensional triangular lattice 
with a Hamiltonian of the form similar to Eq. (2.11). We again set $q = 8$ in this system and assume Gaussian 
distributions in both the voter population and the number of connections for the sites. Again, we normalize the total 
voter population to 1, independent of the number of sites. For the connections, since each site on the triangular 
lattice can connect to at most six nearest neighbors, we set the peak of the Gaussian distribution at 3 and the 
cutoff at 6. Figure 6 is a plot of the energy of the artificial system vs. temperature. In the figure, we have included 
the energy for lattice sizes from $8 \times 8$ to $64 \times 64$. One can clearly see the sharpening of the transition 
around $T \approx 1.6$, with $E$ about 990, again with $\lambda_P$ and $\lambda_D$ equal to 50 and 1 respectively. 
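
For reference, the neighbor list of a fully connected periodic triangular lattice can be generated as in the Python sketch below. It covers only the underlying lattice: the Gaussian-distributed number of connections per site described above would require removing some of these edges, and that step is not shown. The indexing scheme and function name are assumptions made for illustration.

```python
def triangular_lattice_edges(L):
    """Edges of an L x L triangular lattice with periodic boundaries (L >= 3).
    Site (x, y) has index x + L * y and is linked to (x+1, y), (x, y+1)
    and (x+1, y+1), which gives every site six nearest neighbors."""
    edges = []
    for y in range(L):
        for x in range(L):
            i = x + L * y
            for dx, dy in ((1, 0), (0, 1), (1, 1)):
                j = (x + dx) % L + L * ((y + dy) % L)
                edges.append((i, j))
    return edges

# Smallest lattice size mentioned in the text.
edges_8 = triangular_lattice_edges(8)
degree_8 = [0] * 64
for i, j in edges_8:
    degree_8[i] += 1
    degree_8[j] += 1
```

Each site contributes three edges, so an $L \times L$ lattice has $3L^2$ edges in total and every site ends up with six neighbors.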

Figure 7 is a plot of the specific heat capacity $C_V$ vs. temperature of the system. We can here see clearly that $C_V$ 
will diverge as the size of the lattice grows, which confirms a phase transition of the system. The peak value of $C_V$ 
for the $64 \times 64$ lattice is about 2470. 

Figure 8(a) is a map of Taipei city and its 449 precincts. As one lowers the temperature, the system will 
eventually reach its ground state. In reality, one cannot have absolute voter population equality for each of the voter 
districts. The number of near optimal solutions will increase if one allows the percentage difference of voter population 
of a district from the average voter population per district to increase. Figure 8(b) is an illustration of the voter 
districting with the lowest ground state energy from our simulation with $\lambda_P$ and $\lambda_D$ equal to 50 and 1 
respectively. There are a total of 8 voter districts, drawn in different colors. 

Table 1. Voter districting in Taipei city with different $\lambda_P$ and $\lambda_D$. 

Table 1 lists the optimal energy state ($E_{min}$) that we obtain with different values of $\lambda_P$ and $\lambda_D$. 
$E_P$ and $E_D$ are the contributions to $E_{min}$ from the voter population ($\lambda_P H_P$) and district boundary 
($\lambda_D H_D$) in Eq. (2.11). The voter population of each precinct is taken from the 2004 Legislature Election. Also 
included is the largest deviation ($\Delta P_{max}$) of the voter population of a district from the average voter 
population $\langle P \rangle$ and the ratio $\Delta P_{max}/\langle P \rangle$. In the 2008 Legislature Election, the 
Central Election Commission (CEC) of Taiwan constrains $\Delta P_{max}/\langle P \rangle$ of Taipei city to be less than 
15%. On the other hand, one also wants to have a near minimal $E_D$ in order to guarantee contiguity and compactness. 
Taking these into consideration, $\lambda_P$ somewhere between 10 and 100 is preferred for the districting. 

IV. SUMMARY AND DISCUSSION 



In this paper, we show how to use a statistical physics model to study a social economics problem. We have mapped 
the Political Districting Problem to a $q$-state Potts model in which the constraints can be written as interactions 
between sites or external fields acting on the system. Districting into $q$ voter districts is equivalent to finding the 
ground state of the system. Searching for an optimal solution for the ground state becomes an optimization problem, and 
standard optimization algorithms such as the Monte Carlo method or the simulated annealing method can be employed here. 

FIG. 8: (a) A map of Taipei city and its 449 precincts; (b) an illustration of the voter districting in the lowest energy state 
from our simulation with $\lambda_P$ and $\lambda_D$ equal to 50 and 1 respectively. The 8 voter districts are drawn in different colors. 

The system undergoes a phase transition as one lowers the temperature. This transition can be understood as 
follows. At high temperature, only small domains are formed and the whole system is in a random state. As the 
temperature decreases, large domains will begin to form in order to lower the energy of the system. At the critical 
temperature, the system will form large domains and thus will approach the ground state configuration. 

In the example above, we studied the 2008 Taiwan Legislature Election with two constraints, viz. voter population 
equality and compactness. With a suitable choice of the ratio between $\lambda_D$ and $\lambda_P$, the near optimal 
solutions also satisfy the contiguity condition. One can also add other interaction terms for extra constraints. In our 
example here, Taipei city itself currently has 12 administrative zones. The CEC of Taiwan prefers to have no more than 2 
administrative zones in each voter district. Hence the districting here corresponds to adding another constraint to 
the Hamiltonian. One can, for example, add another term to the Hamiltonian which takes the following form 

$$ H_A = \lambda_A \sum_{i,j,k} \delta_{S_i, S_j} \delta_{S_j, S_k} \delta_{S_k, S_i} (1 - \delta_{A_i, A_j})(1 - \delta_{A_j, A_k})(1 - \delta_{A_k, A_i}) , \tag{4.1} $$

where $A_i$ here refers to the administrative zone that site $i$ belongs to and $i$, $j$, $k$ all belong to the same 
voter district. One can see that in Eq. (4.1), the right hand side will give a finite contribution if the sites within 
a voter district belong to 3 or more administrative zones. A large $\lambda_A$ practically eliminates such a possibility. 
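
The administrative zone term of Eq. (4.1) can be illustrated by a brute-force count in Python. The sketch below counts each unordered triple of sites once, which differs from a sum over ordered triples only by a constant factor; the function name and the toy data are hypothetical, and the cubic cost makes it suitable for small examples only.

```python
from itertools import combinations

def admin_zone_energy(spins, zones, lam_a=1.0):
    """lam_a times the number of unordered triples (i, j, k) that share a voter
    district while lying in three pairwise different administrative zones."""
    count = 0
    for i, j, k in combinations(range(len(spins)), 3):
        same_district = spins[i] == spins[j] == spins[k]
        distinct_zones = len({zones[i], zones[j], zones[k]}) == 3
        if same_district and distinct_zones:
            count += 1
    return lam_a * count

# Hypothetical example: four sites in a single voter district.
one_district = [0, 0, 0, 0]
two_zones = ["a", "a", "b", "b"]      # at most 2 zones: no penalty
three_zones = ["a", "a", "b", "c"]    # 3 zones: two penalized triples
```

A district that spans at most two administrative zones gives zero energy, while any district touching three or more zones is penalized.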

In general, one can include additional terms in the total Hamiltonian to take care of new constraints, and the 
methodology we give here should be equally applicable to other districting problems. We have thus shown how one 
can use a statistical physics approach to study "socio-econophysics" problems and demonstrated it with the example of 
districting Taipei city in the 2008 Taiwan Legislature Election. 

This work was supported in part by the National Science Council, Taiwan, R.O.C. (grant no. NSC-94-2112-M-001-019). 